
Conversation

dvrogozh
Contributor

@dvrogozh dvrogozh commented Aug 22, 2025

Changes:

  • Defined a stand-alone class for the filter graph (based on the definition in CpuDeviceInterface)
  • Generalized the filter graph to:
    • Cover HW backends (by accepting a HW frames context)
    • Support configurable filter pipelines (by accepting a string pipeline description). For example, it's possible to pass VAAPI- or CUDA-specific filters (see the sketch after this list):
FiltersContext.filters = "scale=480:270:sws_flags=bilinear"; # for CPU (current usage)
FiltersContext.filters = "scale_vaapi=480:270:format=rgba";  # for VAAPI

These changes will allow:

  • Easily defining an FFmpeg-driven decoding + video processing pipeline for any backend supported by FFmpeg (for multi-GPU support)
  • Adding FFmpeg media filter support to torchcodec (any filter, not only scaling or color conversion)

See the following draft PR for an example of using the defined class with the ffmpeg-vaapi backend to enable Intel GPU support:

I believe it can be used as a close reference to implement similar support for ffmpeg-cuda filters (scale_cuda, scale_npp).

CC: @scotts @NicolasHug @eromomon

@meta-cla bot added the CLA Signed label on Aug 22, 2025
@dvrogozh
Contributor Author

Pushed a fix for the extra ';' typo that was failing some CI checks.

@dvrogozh
Copy link
Contributor Author

The latest CI errors don't seem to be related to this PR. There are 2 errors:

Error: Unable to download artifact(s): Artifact not found for name: pytorch_torchcodec__3.9_cu126_x86_64

and

/Users/ec2-user/runner/_work/torchcodec/torchcodec/pytorch/torchcodec/src/torchcodec/_core/custom_ops.cpp:87:47: error: must specify at least one argument for '...' parameter of variadic macro [-Werror,-Wgnu-zero-variadic-macro-arguments]
  TORCH_INTERNAL_ASSERT(tensor.is_contiguous());
                                              ^
/Users/ec2-user/runner/_work/_temp/conda_environment_17164967222/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:425:9: note: macro 'TORCH_INTERNAL_ASSERT' defined here
#define TORCH_INTERNAL_ASSERT(cond, ...)                                         \
* Getting build dependencies for wheel...

I believe the second error is already addressed in #847, so I will need to rebase. Not sure if the first one is addressed. @scotts: should I rebase now, or do you need time to fix the first issue?

@dvrogozh
Contributor Author

Rebased on top of the latest master. I hope this will help CI pass, as some CI fixes have recently landed.

Member

@NicolasHug NicolasHug left a comment


Thanks a lot for the PR @dvrogozh, this looks great! I made a few comments below from a first pass, LMK what you think.

AVRational inputAspectRatio = {0, 0};
int outputWidth = 0;
int outputHeight = 0;
AVPixelFormat outputFormat = AV_PIX_FMT_NONE;
Member


At the moment, in main, we are hard-coding the output format to AV_PIX_FMT_RGB24. Do we expect the outputFormat to be used differently in the future (maybe in the CUDA 10-bit PR)?
I'm asking because, if not, maybe we should still hard-code the value.

Contributor Author


Do we expect the outputFormat to be used differently in the future (maybe in the CUDA 10bit PR?) ?

Yes, it will be different for HW filters:

  • For CPU, outputFormat = AV_PIX_FMT_RGB24 (or another output format if needed)
  • For CUDA, outputFormat = AV_PIX_FMT_CUDA (and the actual format will be set via a filter option: scale_cuda=format=nv12)
  • For XPU VAAPI, outputFormat = AV_PIX_FMT_VAAPI (and the actual format will be set via a filter option: scale_vaapi=format=rgba)
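As a fragment-style illustration of this point (continuing the naming above; the surrounding code is assumed rather than quoted from the PR), the pairing of outputFormat with the filter string per backend could look like:

// Sketch: how outputFormat pairs with the filter description per backend.
FiltersContext filtersContext;

// CPU: the filter graph itself converts to RGB24 in system memory.
filtersContext.outputFormat = AV_PIX_FMT_RGB24;
filtersContext.filters = "scale=480:270:sws_flags=bilinear";

// CUDA: frames stay on the GPU as AV_PIX_FMT_CUDA; the underlying pixel layout
// is selected through the filter option instead.
filtersContext.outputFormat = AV_PIX_FMT_CUDA;
filtersContext.filters = "scale_cuda=format=nv12";

// VAAPI: same idea with VAAPI surfaces.
filtersContext.outputFormat = AV_PIX_FMT_VAAPI;
filtersContext.filters = "scale_vaapi=format=rgba";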

Dmitry Rogozhkin and others added 3 commits August 29, 2025 21:49
FFmpeg filter graphs cover a lot of use cases, including CPU and GPU usage. This
commit moves filter graph support out of the CPU device interface, which allows
flexible usage across other contexts.

Signed-off-by: Dmitry Rogozhkin <[email protected]>
@dvrogozh
Contributor Author

@NicolasHug, I've addressed your comments. One thing worth noting is that I've introduced SwsFrameContext, as SWS indeed does not need all the fields from FiltersContext, and having 2 separate definitions will actually simplify the code in the future PR, when creation of the filter graph will be moved to SingleStreamDecoder. Please take a look and comment on whether you like this approach better.
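For readers following along, a rough sketch of the leaner struct described here, based on the fields visible in the review snippet below (the exact definition lives in the PR):

// Sketch only: SWS needs just the raw scaling parameters, while FiltersContext
// additionally carries the aspect ratio, output format, filter string, and
// (for HW backends) a frames context.
extern "C" {
#include <libavutil/pixfmt.h>
}

struct SwsFrameContext {
  int inputWidth = 0;
  int inputHeight = 0;
  AVPixelFormat inputFormat = AV_PIX_FMT_NONE;
  int outputWidth = 0;
  int outputHeight = 0;
};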

Member

@NicolasHug NicolasHug left a comment


Thanks @dvrogozh. I've got minor Qs on best practices; I think @scotts is out today, so we can merge this now and follow up if needed.

Thanks a lot for the PR @dvrogozh !

Comment on lines +102 to +106
swsFrameContext.inputWidth = avFrame->width;
swsFrameContext.inputHeight = avFrame->height;
swsFrameContext.inputFormat = frameFormat;
swsFrameContext.outputWidth = expectedOutputWidth;
swsFrameContext.outputHeight = expectedOutputHeight;
Member


Curious if we should prefer using aggregate initialization here, instead of setting each field individually. I think I have a preference for aggregate initialization because if we add a new field to the SwsFrameContext struct, we'd get a loud compilation error if we forget to initialize the field (which is good).

@scotts any pref?
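For concreteness, aggregate initialization of the quoted snippet would look roughly like this (a sketch continuing the variable names from the diff above):

// Sketch: aggregate initialization instead of member-by-member assignment.
// Members are matched by declaration order; note that a missing trailing
// member is simply default-/value-initialized rather than rejected, which is
// the caveat raised in the reply below.
SwsFrameContext swsFrameContext = {
    avFrame->width,        // inputWidth
    avFrame->height,       // inputHeight
    frameFormat,           // inputFormat
    expectedOutputWidth,   // outputWidth
    expectedOutputHeight,  // outputHeight
};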

Contributor


I generally prefer objects to be fully specified on construction, rather than default-constructed and then members set one-by-one. But I don't think aggregate initialization will save us here from forgetting a field, as I think that for types with a default, it will just use that.

What I think we should do is actually just create a constructor for all of our structs. And to make sure we don't miss any members, we should remove the default constructor. (When possible. If we're creating an array of something we won't be able to do that.)
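A minimal sketch of that suggestion, assuming the same field set as above (names are illustrative, not the final code from the PR):

// Sketch: an explicit constructor and no default constructor, so every member
// must be provided at each construction site; adding a field breaks callers loudly.
struct SwsFrameContext {
  int inputWidth;
  int inputHeight;
  AVPixelFormat inputFormat;
  int outputWidth;
  int outputHeight;

  SwsFrameContext() = delete;
  SwsFrameContext(
      int inputWidth,
      int inputHeight,
      AVPixelFormat inputFormat,
      int outputWidth,
      int outputHeight)
      : inputWidth(inputWidth),
        inputHeight(inputHeight),
        inputFormat(inputFormat),
        outputWidth(outputWidth),
        outputHeight(outputHeight) {}
};

// Usage at the call site from the diff above:
SwsFrameContext swsFrameContext(
    avFrame->width,
    avFrame->height,
    frameFormat,
    expectedOutputWidth,
    expectedOutputHeight);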


std::stringstream filters;
filters << "scale=" << expectedOutputWidth << ":" << expectedOutputHeight;
filters << ":sws_flags=bilinear";
Member


It's unfortunate that we now have to re-create the string for every single frame in order to compare the filtersContext objects. Maybe eventually we should separate the concerns between filter-graph creation parameters and comparison operators. But I guess for now this is cheap enough.
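One possible direction for that separation, sketched under the assumption that the raw creation parameters stay accessible on the context (the helper name here is hypothetical): compare the cheap numeric parameters per frame, and only rebuild the filter string when the graph actually needs to be (re)created.

// Sketch: compare raw parameters instead of re-creating the string per frame.
bool sameGraphInputs(const FiltersContext& a, const FiltersContext& b) {
  return a.inputWidth == b.inputWidth && a.inputHeight == b.inputHeight &&
      a.inputFormat == b.inputFormat && a.outputWidth == b.outputWidth &&
      a.outputHeight == b.outputHeight && a.outputFormat == b.outputFormat;
}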

@NicolasHug NicolasHug merged commit ec878eb into pytorch:main Sep 2, 2025
47 of 50 checks passed